23 research outputs found

    Compiling and annotating a learner corpus for a morphologically rich language: CzeSL, a corpus of non-native Czech

    Learner corpora, linguistic collections documenting a language as used by learners, provide an important empirical foundation for language acquisition research and teaching practice. This book presents CzeSL, a corpus of non-native Czech, against the background of theoretical and practical issues in current learner corpus research. Languages with rich morphology and relatively free word order, including Czech, are particularly challenging for the analysis of learner language. The authors address both the complexity of learner error annotation, describing three complementary annotation schemes, and the complexity of describing non-native Czech in terms of standard linguistic categories. The book discusses in detail the practical aspects of corpus creation: the process of collection and annotation itself, the supporting tools, the resulting data, their formats and search platforms. The chapter on use cases exemplifies the usefulness of learner corpora for teaching, language acquisition research, and computational linguistics. Any researcher developing learner corpora will surely appreciate the concluding chapter listing lessons learned and pitfalls to avoid.

    Self-awareness in the dark therapy

    The present work focuses on the process of self-knowledge during Dark Therapy (DT). The theoretical part is devoted to the modern concept of DT in the Czech Republic and places this therapy in its historical context. It also describes the conditions, course and effects of DT on the human psyche and health. The last section of the theoretical part turns to self-knowledge and associated areas. The aim of the practical part is to map how people who have completed DT perceive and experience a stay in the dark, and whether they associate it with better self-knowledge and self-awareness a relatively short time after completing it. The research design is a qualitative multiple case study. The data were collected through semi-structured interviews with 8 respondents shortly after DT. The interview data were subsequently analyzed by open coding; the individual categories are presented using the technique of "laying out the cards", both in the individual case studies and in their comparison. It was found that the motives for DT, the respondents' anchoring in personal development and the activities performed during DT determined, to some degree, the experiences and knowledge the respondents gained during therapy. In the dark, respondents focused on recapitulating their life experiences and relationships, which often made it possible to come to terms with past experiences or to rethink their attitudes and opinions. The knowledge the respondents gained concerned their dispositions, physical perception, relationships with other people and professional orientation. A frequent finding is the "mindfulness" of the respondents, who during the stay felt spiritually more conscious and more "here and now". Respondents showed an increase in self-acceptance, self-confidence and acceptance of others.

    Evaluation of Error Mark-Up in a Learner Corpus of Czech

    Title: Evaluation of Error Mark-Up in a Learner Corpus of Czech Author: Barbora Štindlová Department: Institute of Czech Language and Theory of Communication, Faculty of Arts, Charles University in Prague Supervisor: prof. PhDr. Karel Šebesta, CSc. Abstract: The thesis deals with the topic of Czech as a second language, while introducing methods of corpus linguistics as applied to texts produced by language learners. The context is the process of building and exploiting a learner corpus, with a focus on its error mark-up and options for evaluating the annotation scheme. Learner corpora have become a major resource for investigating a learner interlanguage and a significant incentive for many different types of research and teaching of second/foreign languages. They are used mainly for contrastive studies of native and non-native speakers, i.e. for contrastive interlanguage analysis, and for computer-aided error analysis of the learner language. This kind of analysis is crucially dependent on the type and quality of the error mark-up. In every error-annotated corpus the error annotation is based on an error typology, which is necessarily problematic from a number of theoretical aspects. Evaluation of the reliability and validity of the annotation scheme design is therefore an important step in the build-up..

    MERLIN: Multilingvální platforma pro evropské referenční úrovně : MERLIN: Multilingual Platform for Common Reference Levels

    The paper provides an overview of the motivation, development and major principles of the international project Merlin. The main output of this project is a unique trilingual learner corpus covering German, Italian and Czech. The corpus will be available as an online platform illustrating the Common European Framework of Reference for Languages (CEFR) with authentic learner data and enabling users to explore authentic written learner productions and related metadata (e.g. age, first language of the learner, etc.). Each text in the corpus is linguistically analysed during a multiphase error annotation. This process raises some problematic issues concerning the specific character of Czech as a Slavic language. The paper summarises some of these problems and their possible solutions.

    MERLIN: Multilingual Platform for Common Reference Levels

    The paper provides an overview of the motivation, development and major principles of the international project Merlin. The main output of this project is a unique trilingual learner corpus covering German, Italian and Czech. The corpus will be available as an online platform illustrating the Common European Framework of Reference for Languages (CEFR) with authentic learner data and enabling users to explore authentic written learner productions and related metadata (e.g. age, first language of the learner, etc.). Each text in the corpus is linguistically analysed during a multiphase error annotation. This process raises some problematic issues concerning the specific character of Czech as a Slavic language. The paper summarises some of these problems and their possible solutions.

    Compiling and annotating a learner corpus for a morphologically rich language: CzeSL, a corpus of non-native Czech

    Learner corpora, linguistic collections documenting a language as used by learners, provide an important empirical foundation for language acquisition research and teaching practice. This book presents CzeSL, a corpus of non-native Czech, against the background of theoretical and practical issues in current learner corpus research. Languages with rich morphology and relatively free word order, including Czech, are particularly challenging for the analysis of learner language. The authors address both the complexity of learner error annotation, describing three complementary annotation schemes, and the complexity of describing non-native Czech in terms of standard linguistic categories. The book discusses in detail the practical aspects of corpus creation: the process of collection and annotation itself, the supporting tools, the resulting data, their formats and search platforms. The chapter on use cases exemplifies the usefulness of learner corpora for teaching, language acquisition research, and computational linguistics. Any researcher developing learner corpora will surely appreciate the concluding chapter listing lessons learned and pitfalls to avoid.